Intonation Conversion from Neutral to Expressive Speech

نویسندگان

  • Christophe Veaux
  • Xavier Rodet
چکیده

Intonation is one of the most important factors of speech expressivity. This paper presents a conversion method for the F0 contours. The F0 segments are represented with discrete cosine transform (DCT) coefficients at the syllable level. Multi-level dynamic features are added to model the temporal correlation between syllables and to constrain the F0 contour at the phrase level. Gaussian mixture models (GMM) are used to map the prosodic features between neutral and expressive speech, and the converted F0 contour is generated under the dynamic features constraints. Experimental evaluation using a database of acted emotional speech shows the effectiveness of the proposed F0 model and conversion method.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Transformation of emotion based on acoustic features of intonation patterns for Hindi speech

Changes in intonation patterns may convey not only different meaning but different emotions even if the sequence of speech segments are same in a sentence. The patterns change depending upon structure and emotion of the sentence and require being stored in speech database. It is a difficult and time-consuming task to store all utterances of all the expressive style, which also consumes huge mem...

متن کامل

Clustering of foot-based pitch contours in expressive speech

Intonation generation is still one of the weak links in the textto-speech synthesis chain. It is a hard enough task to generate expressively neutral pitch contours, with accurate placement of accents and phrase boundaries, but to generate appropriate intonation for expressive speech is even more of a challenge. This paper is a first attempt at describing and categorizing the variation in pitch ...

متن کامل

Comparison of chironomic stylization versus statistical modeling of prosody for expressive speech synthesis

Chironomic stylization is the process of real-time modification of intonation contours (f0 and tempo) using drawing/writing gestures with a stylus on a graphic tablet. The question addressed in this research is whether hand-made intonation stylization could improve or degrade expressivity and overall quality, compared to statistical modeling of prosody. A system for expressive TTS in French bas...

متن کامل

Data-driven emotion conversion in spoken English

This paper describes an emotion conversion system that combines independent parameter transformation techniques to endow a neutral utterance with a desired target emotion. A set of prosody conversion methods have been developed which utilise a small amount of expressive training data ( 15 min) and which have been evaluated for three target emotions: anger, surprise and sadness. The system perfo...

متن کامل

A comparison of voice conversion methods for transforming voice quality in emotional speech synthesis

This paper presents a comparison of methods for transforming voice quality in neutral synthetic speech to match cheerful, aggressive, and depressed expressive styles. Neutral speech is generated using the unit selection system in the MARY TTS platform and a large neutral database in German. The output is modified using voice conversion techniques to match the target expressive styles, the focus...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011